A Novel Ant-Based Clustering Approach for Document Clustering

Authors

  • Yulan He
  • Siu Cheung Hui
  • Yongxiang Sim
Abstract

Recently, many studies have applied nature-inspired algorithms to complex machine learning tasks. Ant Colony Optimization (ACO) is one such algorithm; it is based on swarm intelligence and derived from a model of the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper proposes a novel document clustering approach based on ACO. Unlike other ACO-based clustering approaches, which share the scenario of ants moving around a 2D grid and carrying or dropping objects to perform categorization, the proposed ant-based clustering approach does not rely on a 2D grid structure. In addition, it can generate an optimal number of clusters without incorporating other algorithms such as K-means or Agglomerative Hierarchical Clustering (AHC). Experimental results on subsets of the 20 Newsgroups data show that the ant-based clustering approach outperforms classical document clustering methods such as K-means and AHC. It also achieves better results than those obtained using the Artificial Immune Network algorithm when tested on the same datasets.
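
For context, the grid-based scenario that the abstract contrasts against is often described with pick-up and drop probabilities that depend on how similar an object is to its neighborhood. The sketch below illustrates that classical rule (commonly attributed to Deneubourg et al.), not the method proposed in this paper. The function names, the constants `k1` and `k2`, and the assumption that the local density `f` lies in [0, 1] are illustrative choices, not taken from the abstract.

```python
def pick_probability(f: float, k1: float = 0.1) -> float:
    """Chance that an unladen ant picks up an object.

    f  : local density of similar objects around the object (assumed in [0, 1]).
    k1 : hypothetical threshold constant; smaller values make pick-up rarer
         as soon as some similar neighbors are present.
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f: float, k2: float = 0.15) -> float:
    """Chance that a laden ant drops its object at the current grid cell.

    f  : local density of objects similar to the carried one (assumed in [0, 1]).
    k2 : hypothetical threshold constant controlling how quickly drops
         become likely as the density grows.
    """
    return (f / (k2 + f)) ** 2


if __name__ == "__main__":
    # Isolated objects (low f) tend to be picked up; dense, similar
    # neighborhoods (high f) tend to attract drops.
    for f in (0.0, 0.1, 0.5, 1.0):
        print(f, round(pick_probability(f), 3), round(drop_probability(f), 3))
```

Under this rule an ant almost always picks up an object that has no similar neighbors and almost never drops one in an empty region, which is what gradually sorts objects into piles on the grid.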

Similar resources

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...

An ant colony approach for clustering pdf

This paper presents an ant colony optimization methodology for optimally clustering N objects into K clusters. The algorithm employs distributed agents which... Abstract: Ant-based clustering is a biologically inspired data... Multi-ant colonies approach for clustering data that consists of some parallel... based on ant colony to solve the unsupervised clustering... Index Terms: Ant colony optimization, Clu...

Hybrid ANFIS with ant colony optimization algorithm for prediction of shear wave velocity from a carbonate reservoir in Iran

Shear wave velocity (Vs) data are key information for petrophysical, geophysical and geomechanical studies. Although compressional wave velocity (Vp) measurements exist in almost all wells, shear wave velocity is not recorded for most older wells due to a lack of technological tools. Furthermore, measurement of shear wave velocity is to some extent costly. This study proposes a novel methodolo...

A Novel Clustering Approach for Estimating the Time of Step Changes in Shewhart Control Charts

Although control charts are commonly used to monitor process changes, they usually do not indicate the real time of the changes. Identifying the real time of the process changes is known as the change-point estimation problem. There are a number of change-point models in the literature; however, most of the existing approaches are dedicated to normal processes. In this paper we propose a novel app...

Tabu-KM: A Hybrid Clustering Algorithm Based on Tabu Search Approach

The clustering problem under the criterion of minimum sum of squares is a non-convex and non-linear program with many locally optimal values, so its solution often falls into these traps and fails to converge to a globally optimal solution. In this paper, an efficient hybrid optimization algorithm, called Tabu-KM, is developed for solving this problem. It gathers the ...

Publication date: 2006